Document Level Sentiment Analysis with Deep Learning Models#
Load Twitter Datasets#
Twemlab Goldstandard dataset size: 994
Polarity: (1 pos, 0 neu, -1 neg)
Sample of the data:
| | Text | Label | sentiment_label |
|---|---|---|---|
| 472 | Woman and her dog fight off sex attacker in Kings Norton park | anger/disgust | -1 |
| 970 | Birmingham are consulting on extending shared-use paths in Sheldon Park an important cycle link to Marston Green. | none | 0 |
| 145 | birminghammail-new horror book set around the Lickey Hills. | none | 0 |
SemEval Goldstandard dataset size: 3713
Polarity: (1 pos, 0 neu, -1 neg)
Sample of the data:
| | Text | Polarity | sentiment |
|---|---|---|---|
| 2811 | Monday nights are a bargain at the $28 prix fix - this includes a three course meal plus *three* glasses of wine paired with each course. | positive | 1 |
| 886 | very good breads as well. | positive | 1 |
| 1270 | On the other hand, if you are not fooled easily, you will find hundreds of restaurants that will give you service and ambiance that is on par with Alain Ducasse, and food that will outshine in presentaion, taste, choice, quality and quantity. | negative | -1 |
AIFER example dataset size: 11177
Polarity: none (unlabelled data)
Sample of the data:
| | Date | Text | Language |
|---|---|---|---|
| 6519 | 2021-07-20 16:31:14 | @karlheinz_e Da liegt er falsch. Hier haben alte links-grüne-versiffte Soldaten mit grünen Gutmenschen u.a. auch Flüchtlinge unterstützt. AfD-Politiker habe ich hier nicht gesehen. | de |
| 7712 | 2021-07-22 22:02:41 | @MDegen55 All Nathan Fillion Fans of the World. Good Night and happy Friday. ♥️ https://t.co/PtWuzODGwB | en |
| 11116 | 2021-07-21 18:20:17 | @CWerdinger Hier im Katastrophengebiet spielt Corona keine Rolle mehr,auch kein Hass oder Hetze, hier zählt nur noch Hilfsbereitschaft ❤️❤️❤️Masken interessieren hier niemanden mehr, wer Maske tragen will kann das hier im Katastrophengebiet tun, wer nicht brauch auch keine zu tragen. 👍☺️ | de |
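Before modelling, the string polarity labels can be mapped onto the numeric scheme used above (1 pos, 0 neu, -1 neg). A minimal sketch with pandas, using a toy stand-in for the SemEval data (column names taken from the sample above, rows invented for illustration):

```python
import pandas as pd

# Toy stand-in for the SemEval goldstandard; in practice this would be read
# from the dataset file, e.g. pd.read_csv(...)
df = pd.DataFrame({
    "Text": ["very good breads as well.", "Service was awful.", "It is a restaurant."],
    "Polarity": ["positive", "negative", "neutral"],
})

# Map the string polarity onto the numeric scheme used throughout (1 pos, 0 neu, -1 neg)
polarity_map = {"positive": 1, "neutral": 0, "negative": -1}
df["sentiment"] = df["Polarity"].map(polarity_map)
print(df["sentiment"].tolist())  # -> [1, -1, 0]
```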
Transformers-based NLP Models#
Why Transformers?#
In recent years, the transformer model has revolutionized the field of NLP. This ‘new’ deep learning approach has been highly successful in a variety of NLP tasks, including sentiment analysis, and has been shown to outperform both traditional machine learning and earlier deep learning methods on such tasks. Some of its key advantages are:
The encoder-decoder framework: Encoder generates a representation of the input (semantic, context, positional) and the decoder generates output. Common use case: sequence to sequence translation tasks.
Attention mechanisms: Address the information bottleneck of the traditional encoder-decoder architecture (where only the final encoder hidden state is passed to the decoder) by letting the decoder access the encoder hidden states at every step and weigh which states are most relevant.
Transfer learning (i.e. fine-tuning a pre-trained language model)
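The transfer-learning idea can be sketched in a few lines: freeze a pretrained encoder and train only a small task-specific head on top. The tiny MLP below is a stand-in for a real pretrained transformer (in practice one would load e.g. BERT via Hugging Face); the dimensions are illustrative only.

```python
import torch
import torch.nn as nn

# Stand-in for a pretrained encoder; a real setup would load pretrained
# transformer weights here instead of this toy module.
pretrained_encoder = nn.Sequential(nn.Linear(768, 768), nn.Tanh())
for p in pretrained_encoder.parameters():
    p.requires_grad = False  # keep the pretrained weights frozen

# Task-specific head for 3-way sentiment (neg / neu / pos); only this is trained
classifier = nn.Linear(768, 3)

x = torch.randn(4, 768)  # a batch of 4 "document embeddings"
logits = classifier(pretrained_encoder(x))
print(logits.shape)  # torch.Size([4, 3])
```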

A note on Attention#
In transformers, multi-head scaled-dot product attention is usually used. This attention mechanism allows the Transformer to capture global dependencies between different positions in the input sequence, and to weigh the importance of different parts of the input when making predictions.
In scaled dot-product attention, dot products between the query and key vectors are computed for each pair of positions, scaled by the square root of the head dimension, and passed through a softmax; the resulting weights are then used to form a weighted sum of the value vectors. The attention mechanism is repeated several times with different linear projections (hence “multi-head”) to capture different representations of the input.
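In symbols, this is the standard formulation from the original transformer paper (Vaswani et al., 2017):

$$
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V
$$

where $d_k$ is the dimensionality of the key vectors (the `head_dim` in the code implementation below).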
Code implementation (assuming PyTorch 2.0+, which provides `scaled_dot_product_attention` in `torch.nn.functional`):

```python
import torch
from torch import nn
from torch.nn.functional import scaled_dot_product_attention


class AttentionHead(nn.Module):
    def __init__(self, embed_dim, head_dim):
        super().__init__()
        # Independent linear projections for query, key and value
        self.q = nn.Linear(embed_dim, head_dim)
        self.k = nn.Linear(embed_dim, head_dim)
        self.v = nn.Linear(embed_dim, head_dim)

    def forward(self, hidden_state):
        attn_outputs = scaled_dot_product_attention(
            self.q(hidden_state), self.k(hidden_state), self.v(hidden_state))
        return attn_outputs


class MultiHeadAttention(nn.Module):
    def __init__(self, config):
        super().__init__()
        embed_dim = config.hidden_size
        num_heads = config.num_attention_heads
        head_dim = embed_dim // num_heads
        self.heads = nn.ModuleList(
            [AttentionHead(embed_dim, head_dim) for _ in range(num_heads)]
        )
        self.output_linear = nn.Linear(embed_dim, embed_dim)

    def forward(self, hidden_state):
        # Run each head, concatenate along the feature axis,
        # then mix the heads with a final linear layer
        x = torch.cat([h(hidden_state) for h in self.heads], dim=-1)
        x = self.output_linear(x)
        return x
```
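As a quick sanity check of the building block used by `AttentionHead`, here is scaled dot-product attention called on random tensors (again assuming PyTorch 2.0+); the batch size, sequence length, and head dimension are arbitrary:

```python
import torch
import torch.nn.functional as F

# batch=2, seq_len=5, head_dim=64 (arbitrary choices for illustration)
q = torch.randn(2, 5, 64)
k = torch.randn(2, 5, 64)
v = torch.randn(2, 5, 64)

out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([2, 5, 64]) -- one context vector per position
```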
Here’s a visual representation of the attention mechanism at work on the demo text “The hurricane trashed our entire garden”:
```python
from IPython.display import display, HTML

# HTML(url=...) fetches and renders the page; passing the URL as the first
# positional argument would display it as a literal string instead.
display(HTML(url='https://raw.githubusercontent.com/Christina1281995/demo-repo/main/neuron_view.html'))
```